Reads were aligned to the hg38 assembly using BWA (0.7.10-r789). The following alignment QC report was produced:

GSE95632_DiffBind_Report.html

MACS2 (2.1.1) function callpeak was used to call peaks. Peaks within blacklisted regions were filtered out.
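
The blacklist filtering step can be illustrated with a minimal sketch. The report does not say whether a peak is dropped on any overlap with a blacklisted region or only when fully contained in one; the version below assumes any overlap. The function names and the example coordinates are hypothetical and are not taken from the pipeline.

```python
from typing import List, Tuple

# (chromosome, start, end), half-open coordinates as in BED files
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def filter_blacklisted(peaks: List[Interval], blacklist: List[Interval]) -> List[Interval]:
    """Keep only peaks that overlap no blacklisted region (assumed criterion)."""
    return [p for p in peaks if not any(overlaps(p, b) for b in blacklist)]


# Hypothetical example data
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
blacklist = [("chr1", 150, 300)]
kept = filter_blacklisted(peaks, blacklist)
```

The pairwise scan is quadratic and only suitable for illustration; a sorted sweep or an interval tree would be the usual choice at genome scale.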

DiffBind (2.6.6) was used for differential binding site analysis, based on the phenotype file provided. Normalized counts are saved in the following text file:

GSE95632_Condition1_vs_Condition0_counts_normalized_by_diffbind.csv

Differential binding site analysis was done for all comparisons provided in the comparisons file, using DESeq2 by default.

Peaks were annotated with ChIPseeker (1.14.2), using the R annotation packages org.Hs.eg.db and TxDb.Hsapiens.UCSC.hg38.knownGene.

Create bam read columns

Create bam control columns

Create peak bed columns

GR_dex vs. GR_EtOH

Samples in this comparison

Prepare diffbind csv input file

Major steps

  1. Perform differential binding site analysis.

Check if there are replicates in each condition.

## nrep= 2

If there are replicates in each condition, perform the following steps:

Obtain consensus peaks in each comparison condition

## 6 Samples, 81269 sites in matrix (104565 total):
##           ID Tissue Factor Condition Treatment Replicate Caller Intervals
## 1 SRR5309351    ASM     GR     cond0      EtOH         1    bed     68779
## 2 SRR5309353    ASM     GR     cond0      EtOH         2    bed     66637
## 3 SRR5309354    ASM     GR     cond1       Dex         1    bed     70852
## 4 SRR5309356    ASM     GR     cond1       Dex         2    bed     79376
## 5      cond0    ASM     GR     cond0      EtOH       1-2    bed     57139
## 6      cond1    ASM     GR     cond1       Dex       1-2    bed     63064
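
One way to picture the consensus step is sketched below: overlapping peaks from the replicates of one condition are merged, and a merged region is kept only if enough replicates support it. This is an illustration, not DiffBind's implementation, and the support threshold used by the pipeline is not shown in this report; `min_overlap=2` is an assumption, and all names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open


def consensus_peaks(replicates: List[List[Interval]], min_overlap: int = 2) -> List[Interval]:
    """Merge overlapping peaks across replicates and keep merged regions
    supported by at least `min_overlap` distinct replicates."""
    tagged = sorted(
        (chrom, start, end, rep)
        for rep, peaks in enumerate(replicates)
        for chrom, start, end in peaks
    )
    merged = []  # each entry: [chrom, start, end, set of supporting replicates]
    for chrom, start, end, rep in tagged:
        if merged and merged[-1][0] == chrom and start < merged[-1][2]:
            merged[-1][2] = max(merged[-1][2], end)
            merged[-1][3].add(rep)
        else:
            merged.append([chrom, start, end, {rep}])
    return [(c, s, e) for c, s, e, reps in merged if len(reps) >= min_overlap]


# Hypothetical example: only the first region is seen in both replicates
rep1 = [("chr1", 100, 200), ("chr1", 400, 500)]
rep2 = [("chr1", 150, 250), ("chr1", 900, 1000)]
consensus = consensus_peaks([rep1, rep2])
```

In the table above, the cond0 consensus set (57139 intervals) is smaller than either of its replicates (68779 and 66637), which is consistent with a rule of this kind.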

Obtain union peaks from consensus peaks of each condition

## 3 Samples, 78325 sites in matrix:
##            ID Tissue Factor   Condition Treatment Replicate Caller Intervals
## 1       cond0    ASM     GR       cond0      EtOH       1-2    bed     57139
## 2       cond1    ASM     GR       cond1       Dex       1-2    bed     63064
## 3 cond0-cond1    ASM     GR cond0-cond1  EtOH-Dex       1-2    bed     78325
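
The union of the two consensus sets can be sketched as a plain interval merge: every peak from either condition is kept, and overlapping peaks collapse into one region. This is a hedged illustration with hypothetical names and data, not the DiffBind code.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open


def union_peaks(*peaksets: List[Interval]) -> List[Interval]:
    """Merge all peaks from all peak sets into non-overlapping regions."""
    all_peaks = sorted(p for peaks in peaksets for p in peaks)
    merged: List[Interval] = []
    for chrom, start, end in all_peaks:
        if merged and merged[-1][0] == chrom and start < merged[-1][2]:
            last = merged[-1]
            merged[-1] = (chrom, last[1], max(last[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical consensus sets for two conditions
cond0 = [("chr1", 100, 200), ("chr1", 400, 500)]
cond1 = [("chr1", 150, 250), ("chr2", 10, 20)]
union = union_peaks(cond0, cond1)
```

Because overlapping peaks are merged, the union is smaller than the sum of its inputs. In the table above, 57139 + 63064 = 120203 consensus intervals yield 78325 union sites.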

Use counts from union peaksets for differential binding analysis

If there are no replicates in each condition, use union peaks from the two conditions for differential binding analysis. Without replicates, within-condition variability cannot be estimated, so the reported significance values are not meaningful.

  2. Annotate gene regions to peaks

  3. Save differential results and count matrix. If there are replicates in each condition, only save significant binding sites; otherwise save all binding sites.

Top 50 differential binding sites

If there are replicates in each condition, order by p-values; otherwise, order by absolute effect size.
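
The saving and ordering rules described above can be written as one small function. This is a sketch under stated assumptions: the field names (`p_value`, `fdr`, `fold`) are hypothetical, and the 0.05 significance cutoff is borrowed from the volcano plot legend because the report does not state the cutoff used when saving results.

```python
from typing import Dict, List


def select_and_order(sites: List[Dict], has_replicates: bool, alpha: float = 0.05) -> List[Dict]:
    """With replicates: keep significant sites, ordered by p-value.
    Without replicates: keep all sites, ordered by absolute effect size."""
    if has_replicates:
        kept = [s for s in sites if s["fdr"] < alpha]  # alpha is an assumed cutoff
        return sorted(kept, key=lambda s: s["p_value"])
    return sorted(sites, key=lambda s: abs(s["fold"]), reverse=True)


# Hypothetical results for three binding sites
sites = [
    {"id": "a", "p_value": 0.001, "fdr": 0.01, "fold": 1.5},
    {"id": "b", "p_value": 0.2, "fdr": 0.4, "fold": -3.0},
    {"id": "c", "p_value": 0.0001, "fdr": 0.002, "fold": -0.8},
]
with_reps = [s["id"] for s in select_and_order(sites, has_replicates=True)]
without_reps = [s["id"] for s in select_and_order(sites, has_replicates=False)]
```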

Description of annotated DiffBind output

The table above shows selected columns. Here is the description of the full output:

Volcano plots

Volcano plot (binding sites with a q-value < 0.05 are shown in red)

MA plot

Dendrogram based on log2 normalized data

Heatmaps for top 30 significant binding sites

Binding sites were ranked by adjusted p-values.

Principal Component Analysis (PCA) Plot

Compute PCs and variance explained by the first 10 PCs

Variance explained
PC     Proportion of Variance (%)    Cumulative Proportion of Variance (%)
PC1    83.24                         83.24
PC2    16.27                         99.51
PC3    0.2627                        99.77
PC4    0.2289                        100
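
The two columns of the variance table are related as follows: each proportion is that component's variance divided by the total variance, and the cumulative column is the running sum of the proportions. The sketch below shows the arithmetic on hypothetical variances; it is not the code that produced the table.

```python
from typing import List, Tuple


def variance_explained(variances: List[float]) -> Tuple[List[float], List[float]]:
    """Return per-component and cumulative variance explained, in percent."""
    total = sum(variances)
    proportions = [100.0 * v / total for v in variances]
    cumulative = []
    running = 0.0
    for p in proportions:
        running += p
        cumulative.append(running)
    return proportions, cumulative


# Hypothetical component variances (e.g. squared standard deviations of the PCs)
proportions, cumulative = variance_explained([8.0, 1.5, 0.5])
```

The same relation can be checked against the reported values: for PC2, 83.24 + 16.27 = 99.51.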

PCA plots are generated using the first two principal components, colored by known factors (e.g. Status, Tissue, or Donor).

Boxplots for top 20 differentially binding sites

Binding sites were ranked by p-value. Counts were normalized by sequencing depth, and log2-transformed values are used for plotting.
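
A minimal sketch of depth normalization followed by a log2 transform is given below. The report does not state the exact scaling or whether a pseudocount is added, so both choices here are assumptions: counts are rescaled to the mean library size, and a pseudocount of 1 avoids taking the log of zero. All names are hypothetical.

```python
import math
from typing import List


def normalize_log2(counts: List[List[float]], library_sizes: List[float],
                   pseudocount: float = 1.0) -> List[List[float]]:
    """Scale each sample's counts to the mean library size, then log2-transform.

    `counts` has one row per binding site and one column per sample.
    """
    mean_depth = sum(library_sizes) / len(library_sizes)
    return [
        [math.log2(c * mean_depth / depth + pseudocount)
         for c, depth in zip(row, library_sizes)]
        for row in counts
    ]


# Hypothetical example: the second sample was sequenced twice as deeply,
# so equal binding gives twice the raw count but the same normalized value.
values = normalize_log2([[10, 20]], library_sizes=[1_000_000, 2_000_000])
```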

Profile of peaks binding to TSS regions

For studies with replicates, if more than 500 significant peaks were identified, only show significant binding peaks; otherwise, show all peaks.

For studies without replicates, show all peaks.
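
The rule for choosing which peaks enter the TSS profile can be sketched as follows. The threshold of 500 comes from the text above; the `q_value` field name and the 0.05 significance cutoff are assumptions made for illustration.

```python
from typing import Dict, List


def peaks_for_profile(peaks: List[Dict], has_replicates: bool,
                      min_significant: int = 500, q_cutoff: float = 0.05) -> List[Dict]:
    """Return significant peaks only when there are replicates and more than
    `min_significant` significant peaks; otherwise return all peaks."""
    if has_replicates:
        significant = [p for p in peaks if p["q_value"] < q_cutoff]
        if len(significant) > min_significant:
            return significant
    return peaks


# Hypothetical example: 600 significant peaks and 400 non-significant ones
peaks = [{"q_value": 0.01}] * 600 + [{"q_value": 0.5}] * 400
shown_with_reps = peaks_for_profile(peaks, has_replicates=True)
shown_without_reps = peaks_for_profile(peaks, has_replicates=False)
```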

Heatmap of ChIP binding to TSS regions (left).

Average Profile of ChIP peaks binding to TSS region (right).

## Use significant binding sites only.

Distribution of significant peaks’ genomic annotation

Distribution of TF-binding loci relative to TSS
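
A distribution of this kind can be built by measuring the distance from each peak to its nearest TSS and counting peaks per distance bin. The sketch below is a simplified illustration, not the ChIPseeker implementation: it assumes a single chromosome, ignores strand, uses peak centers, and the bin boundaries are assumptions chosen to resemble commonly used ones.

```python
from bisect import bisect_left
from collections import Counter
from typing import List

# (upper bound in bp, label); the boundaries are assumptions for this sketch
BINS = [(1_000, "0-1kb"), (3_000, "1-3kb"), (5_000, "3-5kb"),
        (10_000, "5-10kb"), (100_000, "10-100kb")]


def nearest_tss_distance(pos: int, tss_sorted: List[int]) -> int:
    """Absolute distance from `pos` to the closest TSS in a sorted, non-empty list."""
    i = bisect_left(tss_sorted, pos)
    candidates = tss_sorted[max(i - 1, 0): i + 1]
    return min(abs(pos - t) for t in candidates)


def bin_label(distance: int) -> str:
    for upper, label in BINS:
        if distance <= upper:
            return label
    return ">100kb"


def tss_distance_distribution(peak_centers: List[int], tss_positions: List[int]) -> Counter:
    """Count peaks per distance-to-nearest-TSS bin."""
    tss_sorted = sorted(tss_positions)
    return Counter(bin_label(nearest_tss_distance(c, tss_sorted)) for c in peak_centers)


# Hypothetical positions on one chromosome
dist = tss_distance_distribution([10_500, 12_500, 300_000], [10_000, 50_000])
```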

RNAP2_dex vs. RNAP2_EtOH

Samples in this comparison

Prepare diffbind csv input file

Major steps

  1. Perform differential binding site analysis.

Check if there are replicates in each condition.

## nrep= 2

If there are replicates in each condition, perform the following steps:

Obtain consensus peaks in each comparison condition

## 6 Samples, 25723 sites in matrix (41629 total):
##           ID Tissue Factor Condition Treatment Replicate Caller Intervals
## 1 SRR5309361    ASM  RNAP2     cond0      EtOH         1    bed     25843
## 2 SRR5309362    ASM  RNAP2     cond0      EtOH         2    bed     34451
## 3 SRR5309363    ASM  RNAP2     cond1       Dex         1    bed     28583
## 4 SRR5309364    ASM  RNAP2     cond1       Dex         2    bed     28447
## 5      cond0    ASM  RNAP2     cond0      EtOH       1-2    bed     20426
## 6      cond1    ASM  RNAP2     cond1       Dex       1-2    bed     20547

Obtain union peaks from consensus peaks of each condition

## 3 Samples, 23804 sites in matrix:
##            ID Tissue Factor   Condition Treatment Replicate Caller Intervals
## 1       cond0    ASM  RNAP2       cond0      EtOH       1-2    bed     20426
## 2       cond1    ASM  RNAP2       cond1       Dex       1-2    bed     20547
## 3 cond0-cond1    ASM  RNAP2 cond0-cond1  EtOH-Dex       1-2    bed     23804

Use counts from union peaksets for differential binding analysis

If there are no replicates in each condition, use union peaks from the two conditions for differential binding analysis. Without replicates, within-condition variability cannot be estimated, so the reported significance values are not meaningful.

  2. Annotate gene regions to peaks

  3. Save differential results and count matrix. If there are replicates in each condition, only save significant binding sites; otherwise save all binding sites.

Top 50 differential binding sites

If there are replicates in each condition, order by p-values; otherwise, order by absolute effect size.

Description of annotated DiffBind output

The table above shows selected columns. Here is the description of the full output:

Volcano plots

Volcano plot (binding sites with a q-value < 0.05 are shown in red)

MA plot

Dendrogram based on log2 normalized data

Heatmaps for top 30 significant binding sites

Binding sites were ranked by adjusted p-values.

Principal Component Analysis (PCA) Plot

Compute PCs and variance explained by the first 10 PCs

Variance explained
PC     Proportion of Variance (%)    Cumulative Proportion of Variance (%)
PC1    93.09                         93.09
PC2    6.71                          99.8
PC3    0.1157                        99.92
PC4    0.0841                        100

PCA plots are generated using the first two principal components, colored by known factors (e.g. Status, Tissue, or Donor).

Boxplots for top 20 differentially binding sites

Binding sites were ranked by p-value. Counts were normalized by sequencing depth, and log2-transformed values are used for plotting.

Profile of peaks binding to TSS regions

For studies with replicates, if more than 500 significant peaks were identified, only show significant binding peaks; otherwise, show all peaks.

For studies without replicates, show all peaks.

Heatmap of ChIP binding to TSS regions (left).

Average Profile of ChIP peaks binding to TSS region (right).

## Use significant binding sites only.

Distribution of significant peaks’ genomic annotation

Distribution of TF-binding loci relative to TSS

R version 3.4.3 (2017-11-30)

Platform: x86_64-redhat-linux-gnu (64-bit)

locale: LC_CTYPE=en_US.UTF-8, LC_NUMERIC=C, LC_TIME=en_US.UTF-8, LC_COLLATE=en_US.UTF-8, LC_MONETARY=en_US.UTF-8, LC_MESSAGES=en_US.UTF-8, LC_PAPER=en_US.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.UTF-8 and LC_IDENTIFICATION=C

attached base packages: parallel, stats4, stats, graphics, grDevices, utils, datasets, methods and base

other attached packages: bindrcpp(v.0.2.2), pander(v.0.6.1), viridis(v.0.5.0), viridisLite(v.0.3.0), RColorBrewer(v.1.1-2), gplots(v.3.0.1), ggplot2(v.3.0.0), devtools(v.1.13.5), DT(v.0.4), tidyr(v.0.8.1), ChIPseeker(v.1.14.2), DiffBind(v.2.6.6), SummarizedExperiment(v.1.8.1), DelayedArray(v.0.4.1), matrixStats(v.0.54.0), TxDb.Hsapiens.UCSC.hg38.knownGene(v.3.4.0), GenomicFeatures(v.1.30.3), GenomicRanges(v.1.30.3), GenomeInfoDb(v.1.14.0), org.Hs.eg.db(v.3.5.0), AnnotationDbi(v.1.40.0), IRanges(v.2.12.0), S4Vectors(v.0.16.0), Biobase(v.2.38.0), BiocGenerics(v.0.24.0) and rmarkdown(v.1.9)

loaded via a namespace (and not attached): backports(v.1.1.2), GOstats(v.2.44.0), Hmisc(v.4.1-1), fastmatch(v.1.1-0), plyr(v.1.8.4), igraph(v.1.2.2), lazyeval(v.0.2.1), GSEABase(v.1.40.1), splines(v.3.4.3), BatchJobs(v.1.7), BiocParallel(v.1.12.0), crosstalk(v.1.0.0), gridBase(v.0.4-7), amap(v.0.8-16), digest(v.0.6.16), htmltools(v.0.3.6), GOSemSim(v.2.4.1), GO.db(v.3.5.0), gdata(v.2.18.0), magrittr(v.1.5), checkmate(v.1.8.5), memoise(v.1.1.0), BBmisc(v.1.11), cluster(v.2.0.6), limma(v.3.34.9), Biostrings(v.2.46.0), annotate(v.1.56.2), systemPipeR(v.1.12.0), prettyunits(v.1.0.2), colorspace(v.1.3-2), blob(v.1.1.1), ggrepel(v.0.7.0), dplyr(v.0.7.6), RCurl(v.1.95-4.10), jsonlite(v.1.5), TxDb.Hsapiens.UCSC.hg19.knownGene(v.3.2.2), graph(v.1.56.0), genefilter(v.1.60.0), bindr(v.0.1.1), brew(v.1.0-6), survival(v.2.41-3), sendmailR(v.1.2-1), glue(v.1.3.0), gtable(v.0.2.0), zlibbioc(v.1.24.0), XVector(v.0.18.0), UpSetR(v.1.3.3), Rgraphviz(v.2.22.0), scales(v.1.0.0), DOSE(v.3.4.0), pheatmap(v.1.0.10), DBI(v.0.8), edgeR(v.3.20.9), Rcpp(v.0.12.18), plotrix(v.3.7-3), htmlTable(v.1.11.2), xtable(v.1.8-2), progress(v.1.1.2), foreign(v.0.8-69), bit(v.1.1-12), Formula(v.1.2-2), AnnotationForge(v.1.20.0), htmlwidgets(v.1.2), httr(v.1.3.1), fgsea(v.1.4.1), acepack(v.1.4.1), pkgconfig(v.2.0.2), XML(v.3.98-1.10), nnet(v.7.3-12), locfit(v.1.5-9.1), labeling(v.0.3), tidyselect(v.0.2.4), rlang(v.0.2.2), reshape2(v.1.4.3), later(v.0.7.3), munsell(v.0.5.0), tools(v.3.4.3), RSQLite(v.2.0), evaluate(v.0.10.1), stringr(v.1.3.0), yaml(v.2.1.18), knitr(v.1.20), bit64(v.0.9-7), caTools(v.1.17.1), purrr(v.0.2.5), RBGL(v.1.54.0), mime(v.0.5), DO.db(v.2.9), biomaRt(v.2.34.2), rstudioapi(v.0.7), compiler(v.3.4.3), geneplotter(v.1.56.0), tibble(v.1.4.2), stringi(v.1.2.3), lattice(v.0.20-35), Matrix(v.1.2-12), pillar(v.1.2.1), data.table(v.1.11.4), bitops(v.1.0-6), httpuv(v.1.4.3), rtracklayer(v.1.38.3), qvalue(v.2.10.0), R6(v.2.2.2), latticeExtra(v.0.6-28), hwriter(v.1.3.2), RMySQL(v.0.10.15), 
promises(v.1.0.1), ShortRead(v.1.36.1), KernSmooth(v.2.23-15), gridExtra(v.2.3), boot(v.1.3-20), gtools(v.3.5.0), assertthat(v.0.2.0), DESeq2(v.1.18.1), Category(v.2.44.0), rprojroot(v.1.3-2), rjson(v.0.2.20), withr(v.2.1.2), GenomicAlignments(v.1.14.2), Rsamtools(v.1.30.0), GenomeInfoDbData(v.1.0.0), rpart(v.4.1-11), grid(v.3.4.3), rvcheck(v.0.1.0), shiny(v.1.1.0) and base64enc(v.0.1-3)